Abstract - One of the main characteristics of real-world networks is their large clustering.
Clustering is one aspect of a more general but much less studied structural organization of networks, 
i.e. edge multiplicity, defined as the number of triangles in which edges, rather than vertices, 
participate. Here we show that the multiplicity distribution of real networks is in many cases 
scale-free, and in general very broad. Thus, besides the fact that in real networks the number of 
edges attached to vertices often has a scale-free distribution, we find that the number of triangles 
attached to edges can have a scale-free distribution as well. We show that current models, even 
when they generate clustered networks, systematically fail to reproduce the observed multiplicity 
distributions. We therefore propose a generalized model that can reproduce networks with ar- 
bitrary distributions of vertex degrees and edge multiplicities, and study many of its properties 
analytically. 



Introduction. — Real networks, where nodes (or vertices) are intricately connected by links (or edges), are characterized by complex topological properties such as a scale-free distribution of the degree (number of edges reaching a vertex), degree-degree correlations, and nonvanishing degree-dependent clustering (density of triangles reaching a vertex) [1]. Understanding the structural and dynamical properties of complex networks relies strongly on the ability to investigate theoretical models that are both realistic and analytically solvable. Several analytically solvable models reproducing the most important local property of real networks, i.e. the degree distribution, have been proposed [1]. However, models reproducing higher-order properties, including clustering (also called transitivity), are few, and are either entirely computational [2,3] (i.e. not analytically solvable) or 
solvable only for particular cases, e.g. when triangles are 
non-overlapping [JHZ] or when the network is made by 
cliques [5] or other subgraphs [S] embedded in a tree- 
like skeleton. Unfortunately, real networks generally vio- 
late the above particular conditions, as empirical analyses 
have revealed and as we will further show in what follows. 



Moreover, it has been shown that clustering is only one 
aspect of a more general topological organization which 
is best captured by edge multiplicity pilSlllOj. i.e. the 
number of triangles in which edges, rather than vertices, 
participate. Besides being more informative than vertex- 
based clustering, edge multiplicity strongly determines the 
percolation properties of networks [11] and their community structure [12]. 

A model with arbitrary edge multiplicities. — In order to overcome these limitations, here we propose an analytically solvable model of networks with no restriction on their clustering properties, and able to generate edges of any multiplicity. Let us denote by $m(i,j)$ the multiplicity of the edge $(i,j)$, i.e. $m(i,j) = \sum_{k \neq i,j} a_{ik} a_{kj}$, where $a_{ij} = 1$ if a link between vertices $i$ and $j$ is there, and $a_{ij} = 0$ otherwise. In our model we allow each vertex $i$ to have $k_i^{(0)}$ edges of zero multiplicity, $k_i^{(1)}$ edges of multiplicity 1 and so on, up to $k_i^{(M)}$ edges of multiplicity $M$, where $M = N - 2$ is the maximum possible multiplicity in a network with $N$ vertices. Thus each vertex $i$ is assigned an $(N-1)$-dimensional vector $\mathbf{k}_i = (k_i^{(0)}, \ldots, k_i^{(M)})$, which we call the generalized degree, specifying the multiplicity structure in the neighborhood of $i$. The ordinary degree of vertex $i$ is $k_i = \sum_{m=0}^{M} k_i^{(m)}$. Accordingly, we consider the ensemble of random networks with a specified distribution $P(\mathbf{k}) = P(k^{(0)}, \ldots, k^{(M)})$ of generalized degrees.
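As an illustration of these definitions, the following Python sketch computes edge multiplicities and generalized degree vectors for a small graph given as an edge list. It is not part of the model itself: the function names `edge_multiplicities` and `generalized_degrees` are hypothetical, and the input is assumed to be a simple undirected graph (no self-loops, no repeated edges) on vertices labelled 0 to N-1.

```python
from itertools import combinations


def edge_multiplicities(n, edges):
    """Return {(i, j): m(i, j)} for a simple undirected graph on vertices 0..n-1."""
    adj = [set() for _ in range(n)]
    for i, j in edges:
        adj[i].add(j)
        adj[j].add(i)
    # m(i, j) = sum_k a_ik a_kj = number of common neighbours of i and j
    return {(min(i, j), max(i, j)): len(adj[i] & adj[j]) for i, j in edges}


def generalized_degrees(n, edges):
    """Return the (N-1)-dimensional vectors k_i = (k_i^(0), ..., k_i^(N-2))."""
    k = [[0] * (n - 1) for _ in range(n)]
    for (i, j), m in edge_multiplicities(n, edges).items():
        k[i][m] += 1  # the edge contributes to both of its endpoints
        k[j][m] += 1
    return k


# Complete graph on 5 vertices: every edge takes part in 3 triangles.
K5 = list(combinations(range(5), 2))
print(generalized_degrees(5, K5)[0])
```

Summing the entries of each returned vector gives the ordinary degree of the corresponding vertex.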

Our approach reduces to various previously proposed models in special cases, but is more general and allows one to investigate analytically more realistic regimes which have not been explored so far. 

• If $\mathbf{k}_i = (k_i^{(0)}, 0, \ldots, 0)$, our model reduces to the configuration model [13,14] where each vertex $i$ has a specified degree $k_i = k_i^{(0)}$, and the network is locally tree-like (edges have zero multiplicity). This model has vanishing clustering in the thermodynamic limit $N \to \infty$, and is thus inadequate to reproduce clustered networks. 

• If $\mathbf{k}_i = (k_i^{(0)}, k_i^{(1)}, 0, \ldots, 0)$, Newman's clustered model [6] is recovered, where each vertex $i$ has $k_i^{(0)} = s_i$ "single" edges and $k_i^{(1)} = 2t_i$ edges belonging to $t_i$ triangles. Although this model has a finite clustering for $N \to \infty$, it can only produce networks in the weak transitivity regime [HIS], i.e. where the clustering coefficient of a vertex with degree $k$ is $c(k) \le (k-1)^{-1}$ (see Fig. 1a). 

• If $\mathbf{k}_i = (0, k_i^{(1)}, 0, \ldots, 0)$, we recover the model by Shi et al. [7] where all triangles are closed. This model is the maximally clustered version of Newman's model, i.e. $c(k) = (k-1)^{-1}$, but still cannot produce strong transitivity. 

• If $\mathbf{k}_i = (k_i^{(0)}, 0, \ldots, 0, k_i^{(c-2)}, 0, \ldots, 0)$, we recover Gleeson's model [8] where each vertex $i$ belongs to a clique of $c$ vertices (and thus has $k_i^{(c-2)} = c - 1$ links of multiplicity $c - 2$) and has $k_i^{(0)} = k_i - c + 1$ additional external links of zero multiplicity, thus forming a network where cliques are embedded in a tree-like structure. Although this model can produce networks with strong transitivity, it forces any vertex to belong to only one clique. Thus it fails to reproduce networks with overlapping communities of densely interconnected vertices [12]. 

• Finally, if $\mathbf{k}_i = (k_i^{(0)}, k_i^{(1)}, k_i^{(2)}, 0, \ldots, 0)$, we recover the model recently proposed by Karrer and Newman [9] where, in addition to single edges and edges belonging to triangles, edges belonging to diamonds (thus with multiplicity 2) are also introduced. More generally, that model allows one to embed any type of small subgraph into a higher-order tree-like structure, and can thus produce strong transitivity as in Gleeson's model. However, the model can only be applied as long as the set of specified subgraphs is fixed a priori, and its analytical complexity grows rapidly with the number and size of the subgraphs considered. The empirical results that we will show in a moment make this approach inadequate to reproduce real networks. 

Fig. 1: a) Maximally clustered configuration ($c = 1/3$) allowed for the top vertex ($k = 4$) in networks with non-overlapping triangles (weak transitivity) such as Newman's model [6]. b) Maximally clustered configuration ($c = 1$) for the top vertex ($k = 4$) in networks with overlapping triangles (strong transitivity), which is achieved in our model by assigning that vertex a generalized degree $\mathbf{k} = (0, 0, 0, 4, 0, \ldots)$. 

Edge multiplicity in real networks. — In all the above models, the fraction $\Phi(m)$ of edges with multiplicity $m$ is fully concentrated on the smallest possible values, i.e. $m = 0, 1, 2$ depending on the particular model (except in Gleeson's model, where a broader distribution of multiplicities can be obtained with a suitable choice of clique sizes, at the cost of losing an important degree of freedom required to fit other properties of real networks [5]). It is important to compare this prediction with the multiplicity structure of real networks. In Fig. 2 we show the cumulative edge multiplicity distribution $\Phi_>(m) = \sum_{m' \ge m} \Phi(m')$ for various real networks. We find that sparse networks, such as the Internet and metabolic networks, display a power-law distribution of edge multiplicities (with similar exponents). Denser networks such as the World Trade Web show a distribution which is peaked at some very large value (see inset). 
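The two distributions discussed here can be computed from a list of edge multiplicities with a few lines of Python. The sketch below is only illustrative: the function names are hypothetical, and it assumes that the cumulative sum includes the value $m$ itself (the inequality convention is an assumption, not something stated by the data).

```python
from collections import Counter


def multiplicity_distribution(multiplicities):
    """Phi(m): fraction of edges whose multiplicity equals m."""
    counts = Counter(multiplicities)
    total = len(multiplicities)
    return {m: c / total for m, c in sorted(counts.items())}


def cumulative_distribution(phi):
    """Phi_>(m): fraction of edges with multiplicity m' >= m (assumed convention)."""
    out, running = {}, 0.0
    for m in sorted(phi, reverse=True):  # accumulate from the largest multiplicity down
        running += phi[m]
        out[m] = running
    return dict(sorted(out.items()))


phi = multiplicity_distribution([0, 0, 1, 3])
print(phi)
print(cumulative_distribution(phi))
```

A power-law multiplicity distribution would appear as a straight line when `cumulative_distribution` is plotted on doubly logarithmic axes.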

In these and all other cases shown, the distributions are 
broad and extend over many orders of magnitude, in sharp 
contrast with the predictions of the above models. In 
particular, scale-free multiplicity distributions imply that, 
in models with modules embedded in tree-like structures, 
subgraphs of any size should be attached to vertices in or- 
der to reproduce the observed multiplicity structure. In 
this situation, such models become analytically intractable 
and their very philosophy becomes inappropriate. Indeed, 
the empirical results shown above suggest that network 
formation is much more decentralized than assumed by 
locally generating non-overlapping modules of fixed size 
and sparsely connecting them to one another. The con- 
cept of module itself appears vague, due to the lack (or 
to the unreasonable largeness) of a typical scale for the 
subgraphs required to describe the network. Remarkably, besides the fact that in real networks the number of edges attached to vertices often has a scale-free distribution, we found that the number of triangles attached to edges can have a scale-free distribution as well. 

Fig. 2: Cumulative edge multiplicity distributions $\Phi_>(m)$ for various real networks. Inset: histogram of edge multiplicities (non-cumulative distribution) for the World Trade Web (WTW), as an example of a network with unusually high density. 

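Generating functions and clustering. — Our model, by allowing $\mathbf{k}$ to have a more general structure, can span the entire multiplicity spectrum without explicitly introducing subgraphs, overcoming the limitations of the aforementioned models (see Fig. 1b). The ordinary degree distribution is 

$$p(k) = \sum_{|\mathbf{k}| = k} P(\mathbf{k}), \qquad (1)$$

where $|\mathbf{k}| \equiv \sum_{m=0}^{M} k^{(m)}$. The generating function of the probability $P(\mathbf{k})$ is 

$$g(\mathbf{z}) = \sum_{\mathbf{k}} (\mathbf{z} \wedge \mathbf{k})\, P(\mathbf{k}), \qquad (2)$$

where $\mathbf{z} \wedge \mathbf{k} \equiv \prod_{m=0}^{M} z_m^{k^{(m)}}$ and $g(\mathbf{z}) = g(z_0, \ldots, z_M)$. The generating function of the degree distribution is 

$$g_0(z) = \sum_{k} p(k)\, z^k = g(z, \ldots, z). \qquad (3)$$

We can now compute the transitivity of the network. First we need to count the triangles. A vertex of generalized degree $\mathbf{k}$ takes part in $\mathbf{l} \cdot \mathbf{k}/2$ triangles, and each triangle is counted at its three vertices, so that 

$$N_\Delta = \frac{N}{6}\, \mathbf{l} \cdot \nabla_{\mathbf{z}}\, g(\mathbf{z}) \Big|_{\mathbf{z} = \mathbf{1}}, \qquad (4)$$

where $\mathbf{e}_m$ is a unit vector of multiplicity $m$ (i.e. $\mathbf{e}_0 = (1, 0, \ldots, 0)$, $\mathbf{e}_1 = (0, 1, 0, \ldots, 0)$, etc.) and $\mathbf{l} = \sum_{m=0}^{M} m\, \mathbf{e}_m$. The total number of connected triples is 

$$N_3 = \frac{N}{2}\, \frac{d^2 g_0(z)}{dz^2} \Big|_{z = 1}, \qquad (5)$$

so that the transitivity, which is defined as $T = 3N_\Delta / N_3$, does not vanish when $N \to \infty$. Therefore, as expected, our model successfully produces networks with non-vanishing overall clustering. It can also generate any desired clustering spectrum, i.e. the average clustering $c(k)$ of vertices with degree $k$. The latter is 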



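$$c(k) = \frac{1}{N p(k)} \sum_{i:\, k_i = k} \frac{2 N_\Delta(i)}{k(k-1)}, \qquad (6)$$

where $N_\Delta(i)$ is the number of pairs of mutually connected neighbors of vertex $i$, i.e. the number of triangles attached to $i$. Since $2N_\Delta(i) = \sum_m m\, k_i^{(m)} = \mathbf{l} \cdot \mathbf{k}_i$, with $\mathbf{l} = (0, 1, \ldots, M)$, this leads to 

$$k(k-1)\, c(k)\, p(k) = \sum_{|\mathbf{k}| = k} \mathbf{l} \cdot \mathbf{k}\, P(\mathbf{k}), \qquad (7)$$

where the sum runs over the generalized degrees whose entries add up to $k$. The above relations hold for every network. It is thus possible to choose $P(\mathbf{k})$ in order to reproduce both $p(k)$ and $c(k)$, as in other models. 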
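The relation between generalized degrees and clustering can be checked numerically. The following Python sketch is an illustration only, with hypothetical function names: it uses the fact that a vertex with generalized degree $\mathbf{k}_i$ takes part in $\sum_m m\,k_i^{(m)}/2$ triangles, and evaluates the local clustering coefficient and the transitivity directly from a generalized degree sequence.

```python
def local_clustering(k_vec):
    """c_i = (sum_m m k_i^(m)) / (k_i (k_i - 1)) for one generalized degree vector."""
    k = sum(k_vec)
    if k < 2:
        return 0.0  # clustering is undefined for degree < 2; 0 is used by convention here
    twice_triangles = sum(m * km for m, km in enumerate(k_vec))  # equals 2 N_triangles(i)
    return twice_triangles / (k * (k - 1))


def transitivity(gen_degrees):
    """T = 3 N_triangles / N_triples from a sequence of generalized degree vectors."""
    six_triangles = sum(m * km for kv in gen_degrees for m, km in enumerate(kv))
    two_triples = sum(sum(kv) * (sum(kv) - 1) for kv in gen_degrees)
    return six_triangles / two_triples if two_triples else 0.0


print(local_clustering([0, 4]))        # two non-overlapping triangles (weak transitivity)
print(local_clustering([0, 0, 0, 4]))  # four mutually connected neighbours
print(transitivity([[0, 0, 3]] * 4))   # complete graph on four vertices
```

The first two calls correspond to the two configurations of Fig. 1 for a vertex of degree 4.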



Percolation properties. — Importantly, we can study the percolation properties of our model analytically, thus extending previous results [31ini[Hl[II] to more general cases. Let $D(s|\mathbf{k})$ be the probability that a vertex of generalized degree $\mathbf{k}$ is a member of a set of $s$ mutually reachable vertices. Similarly, let $d(s|\mathbf{k})$ be the probability that a vertex connected to a vertex $v$ of generalized degree $\mathbf{k}$ can reach $s$ other vertices, excluding the vertex $v$ and its neighborhood. The relation between $D(s|\mathbf{k})$ and $d(s|\mathbf{k})$ is 

$$D(s|\mathbf{k}) = \sum_{s_1, \ldots, s_k} d(s_1|\mathbf{k}) \cdots d(s_k|\mathbf{k})\, \delta_{s,\, 1 + s_1 + \ldots + s_k}. \qquad (8)$$



We can also write a recursion relation for $d(s|\mathbf{k})$ as 

$$d(s|\mathbf{k}) = \sum_{\mathbf{h}} \sum_{m=0}^{\min(h,k)-1} p(\mathbf{h}, m|\mathbf{k}) \sum_{s_1, \ldots, s_{h_r}} d(s_1|\mathbf{h}) \cdots d(s_{h_r}|\mathbf{h})\, \delta_{s,\, 1 + s_1 + \ldots + s_{h_r}}, \qquad (9)$$

where $p(\mathbf{h}, m|\mathbf{k})$ represents the probability to select, around a vertex of generalized degree $\mathbf{k}$, an edge of multiplicity $m$ leading to a vertex of generalized degree $\mathbf{h}$. The reduced degree $h_r$ is the number of neighbors of the destination vertex, excluding the first vertex and the $m$ neighbors that the two vertices share, i.e. $h_r = h - m - 1$. If degree-degree correlations can be neglected, $p(\mathbf{h}, m|\mathbf{k})$ reads 



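$$p(\mathbf{h}, m|\mathbf{k}) = \frac{k^{(m)}}{k}\, \frac{h^{(m)} P(\mathbf{h})}{\langle k^{(m)} \rangle}, \qquad (10)$$

where $\langle k^{(m)} \rangle = \sum_{\mathbf{h}} h^{(m)} P(\mathbf{h})$. The first fraction in eq. (10) represents the probability to leave a vertex of generalized degree $\mathbf{k}$ following an edge of multiplicity $m$. The second fraction is the probability to reach a vertex of generalized degree $\mathbf{h}$ following that edge. We can also use eq. (9) to write the generating functions $d(z|\mathbf{k}) = \sum_s z^s d(s|\mathbf{k})$ of the probabilities $d$: 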



$$d(z|\mathbf{k}) = z \sum_{\mathbf{h}} \sum_{m=0}^{\min(h,k)-1} p(\mathbf{h}, m|\mathbf{k})\, [d(z|\mathbf{h})]^{h_r}. \qquad (11)$$
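If eq. (11) has a stable solution $d(z=1|\mathbf{k}) < 1$ the network percolates. In order to study the stability of eq. (11) around $z = 1$ we can study a perturbative solution $d(z=1|\mathbf{k}) \approx 1 + x(\mathbf{k})\epsilon$ in the limit $\epsilon \to 0$, which yields 

$$x(\mathbf{k}) = \sum_{\mathbf{h}} \sum_{m=0}^{\min(h,k)-1} p(\mathbf{h}, m|\mathbf{k})\, (h - m - 1)\, x(\mathbf{h}) = \sum_{\mathbf{h}} \left[ (h-1)\, \boldsymbol{\alpha} \cdot (\boldsymbol{\beta} * \mathbf{h}) - \boldsymbol{\alpha} \cdot (\mathbf{l} * \boldsymbol{\beta} * \mathbf{h}) \right] P(\mathbf{h})\, x(\mathbf{h}), \qquad (12)$$

where the second form uses the uncorrelated expression of eq. (10), $\boldsymbol{\alpha} = \mathbf{k}/k$, $\boldsymbol{\beta} = \sum_m \langle k^{(m)} \rangle^{-1} \mathbf{e}_m$, and $*$ denotes the elementwise product, e.g. $\mathbf{l} * \boldsymbol{\beta} = \sum_m l_m \beta_m \mathbf{e}_m$ is a vector. The percolation transition occurs when the maximum eigenvalue of the matrix in eq. (12) is larger than 1, i.e. $\Lambda_{\max} > 1$. Thus we have obtained an analytical expression for the percolation transition, more general than the one known for networks in the weak transitivity regime [5], and valid for any level of clustering and multiplicity. 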
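To make the eigenvalue criterion concrete, the Python sketch below builds the matrix of the linearized recursion for a finite set of generalized degree classes and estimates its largest eigenvalue. Several points are assumptions of this illustration rather than statements of the text: the uncorrelated form $p(\mathbf{h}, m|\mathbf{k}) = (k^{(m)}/k)\, h^{(m)} P(\mathbf{h}) / \langle k^{(m)} \rangle$ is used, all vectors are taken to have the same length, the function names are hypothetical, and the eigenvalue is obtained by a plain power iteration on the shifted matrix $A + I$, which is adequate for small non-negative matrices but is not a general-purpose solver.

```python
def percolation_matrix(classes):
    """classes: list of (k_vec, P) pairs, with k_vec = (k^(0), ..., k^(M)).

    Returns A with A[a][b] = sum_m p(h_b, m | k_a) * (h_b - m - 1),
    using the uncorrelated (assumed) form of p(h, m | k).
    """
    n_mult = len(classes[0][0])
    mean_km = [sum(P * kv[m] for kv, P in classes) for m in range(n_mult)]
    A = []
    for kv, _ in classes:
        k = sum(kv)
        row = []
        for hv, Ph in classes:
            h = sum(hv)
            entry = 0.0
            for m in range(n_mult):
                if k > 0 and mean_km[m] > 0:
                    entry += (kv[m] / k) * (hv[m] * Ph / mean_km[m]) * (h - m - 1)
            row.append(entry)
        A.append(row)
    return A


def largest_eigenvalue(A, iterations=1000):
    """Power iteration on A + I (the shift avoids oscillations); assumes A >= 0."""
    x = [1.0] * len(A)
    shifted = 1.0
    for _ in range(iterations):
        y = [xi + sum(a * xj for a, xj in zip(row, x)) for row, xi in zip(A, x)]
        shifted = max(y)
        x = [v / shifted for v in y]
    return shifted - 1.0


# Tree-like limit: half of the vertices have degree 1 and half degree 3.
tree_like = [((1,), 0.5), ((3,), 0.5)]
print(largest_eigenvalue(percolation_matrix(tree_like)))
```

In the tree-like limit (all multiplicities zero) the value returned should coincide with the familiar ratio $\langle k(k-1) \rangle / \langle k \rangle$ of the configuration model, which provides a simple consistency check of the sketch.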

Rich-club effect. — As another example of the effects of broad edge multiplicities, we now consider the rich-club coefficient $R(k)$, defined as the observed number of edges $E_>(k)$ among the $N_>(k) = N p_>(k)$ vertices with degree larger than $k$ (where $p_>(k)$ is the cumulative degree distribution), divided by the maximum allowed number $N_>(k)(N_>(k) - 1)/2 \approx N^2 p_>^2(k)/2$ (TSHII]. In random networks with given degree distribution, the rich club behaves approximately as $R(k)_{\rm rand} \sim k^2/(\langle k \rangle N)$ [18], so that the measured $R(k)$ must be compared to this non-constant value. We now consider the case when, as in our model, one also specifies a multiplicity distribution $\Phi(m)$. Since every edge $(i,j)$ with multiplicity $m(i,j) \ge k$ surely connects two vertices $i,j$ with degrees $k_i, k_j > k$, the expected value of $E_>(k)$ now receives a contribution $E\,\Phi_>(k)$ from edges with multiplicity $m \ge k$ (where $E$ is the total number of edges), and the standard approximation can only be applied to the remaining $E[1 - \Phi_>(k)]$ edges. Following [18], we obtain the modified expectation 



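$$R(k)_{\rm rand} \approx \frac{\langle k \rangle\, \Phi_>(k)}{N p_>^2(k)} + \left[ 1 - \Phi_>(k) \right] \frac{k^2}{\langle k \rangle N}. \qquad (13)$$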



If, as in some of the networks considered above, the cumulative distributions $\Phi_>(k)$ and $p_>(k)$ are power laws with exponents $-\alpha$ and $-\gamma$ respectively, the asymptotic behavior of the first summand is $\sim k^{2\gamma - \alpha}$, thus reducing or increasing the predicted scaling $\sim k^2$. 
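The modified expectation is straightforward to evaluate numerically. The Python sketch below is a minimal illustration with hypothetical names: it takes the cumulative degree distribution $p_>(k)$ and the cumulative multiplicity distribution $\Phi_>(k)$ at a given $k$ as inputs, uses $E = N \langle k \rangle / 2$, and returns the sum of the contribution forced by high-multiplicity edges and the standard random contribution.

```python
def rich_club_expected(k, n_vertices, mean_degree, p_cum, phi_cum):
    """Expected rich-club coefficient at degree k.

    p_cum   : cumulative degree distribution p_>(k)
    phi_cum : cumulative multiplicity distribution Phi_>(k)
    """
    # Edges with multiplicity >= k necessarily join two vertices of degree > k.
    forced = mean_degree * phi_cum / (n_vertices * p_cum ** 2)
    # The remaining edges follow the usual k^2 / (<k> N) approximation.
    random_part = (1.0 - phi_cum) * k ** 2 / (mean_degree * n_vertices)
    return forced + random_part


print(rich_club_expected(10, 1000, 4.0, 0.1, 0.0))   # no edges of multiplicity >= k
print(rich_club_expected(10, 1000, 4.0, 0.1, 0.05))  # 5% of edges have multiplicity >= k
```

Setting `phi_cum` to zero recovers the standard expectation for random networks with a given degree distribution.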

Graphic generalized degree sequences. — There 
have been many attempts in the literature to generate 
null models of real networks by generating ensembles of 
random graphs with given properties. Some of these approaches make use of generating functions [4,5,14], as in 
the present paper. Other approaches aim at constructing 



randomized ensembles computationally, and generate so-called microcanonical ensembles [19-21] of networks with 
sharp constraints. Finally, other approaches aim at de- 
scribing random networks with given properties analyt- 
ically, and generate (grand) canonical ensembles of net- 
works with soft constraints P^27j . 

If our model is used as a null model for a particular real network, it gives predictions about the ensemble of random graphs having the same generalized degree sequence $\{\mathbf{k}_i\}_{i=1}^{N}$ as the real network. This generalizes the configuration model [13,14] where only the ordinary degree sequence $\{k_i\}_{i=1}^{N}$ is specified. In the latter case, if $\{k_i\}_{i=1}^{N}$ is taken from a real network, one is sure that it is a graphic sequence. Otherwise, if one generates it artificially, one must enforce specific conditions, given by the Erdős-Gallai P5] and Havel-Hakimi [53] theorems, ensuring that the sequence is graphic. In our case, the realizability of $\{\mathbf{k}_i\}_{i=1}^{N}$ is much more complicated than in the case of ordinary graphic degree sequences, but we now show how it can be related to the standard problem. For convenience, we define the $N \times (N-1)$ matrix $Q$ with entries $Q_{ij} = k_i^{(j-1)}$. The row and column sums (i.e. the margins) of $Q$ are the degree sequence and the (unnormalized) multiplicity distribution respectively: 


$$Q_{i+} \equiv \sum_{j=1}^{N-1} Q_{ij} = \sum_{m=0}^{N-2} k_i^{(m)} = k_i, \qquad (14)$$

$$Q_{+j} \equiv \sum_{i=1}^{N} Q_{ij} = \sum_{i=1}^{N} k_i^{(j-1)} = 2E^{(j-1)}, \qquad (15)$$

where $E^{(m)}$ denotes the total number of edges with multiplicity $m$. Therefore, as a first condition we find that the marginal (ordinary) degree sequence $\{k_i\}_{i=1}^{N}$ must be graphic. There are however strong additional constraints. First note that, since we can always partition the edges into disjoint sets (each with given multiplicity), each of the $M+1$ sequences $\{k_i^{(m)}\}_{i=1}^{N}$ (one for each $m = 0, \ldots, M$) must be separately graphic. This introduces constraints along each column of $Q$. Moreover, since edge multiplicities must be consistent with each other, there are also constraints along each row of $Q$. 
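The margins of $Q$ can be computed directly from a generalized degree sequence stored as a list of rows. The short Python sketch below is illustrative only (the function name `margins` is hypothetical) and uses 0-based indexing, so that the entry `gen_degrees[i][m]` plays the role of $k_i^{(m)}$.

```python
def margins(gen_degrees):
    """Row sums (ordinary degrees) and column sums (2 E^(m)) of the matrix Q."""
    degrees = [sum(row) for row in gen_degrees]
    twice_edges_by_mult = [sum(col) for col in zip(*gen_degrees)]
    return degrees, twice_edges_by_mult


# Complete graph on four vertices: every vertex has three edges of multiplicity 2.
degrees, twice_edges = margins([[0, 0, 3]] * 4)
print(degrees)      # ordinary degree sequence
print(twice_edges)  # twice the number of edges of each multiplicity
```

An odd column sum already signals that the sequence cannot be realized, since each edge of multiplicity $m$ is counted once at each of its two endpoints.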

A useful mapping allows us to solve the problem. For a given vertex $i$, we consider the subgraph $\Gamma_i$ whose vertices are the neighbors of $i$ and whose edges are their mutual connections. An example is shown in Fig. 3 (note that $\Gamma_i$ does not contain vertex $i$ itself). If we denote by $[x]_i$ the value of a topological property $x$ (e.g. the number $E$ of edges, or the link density $D = 2E/[N(N-1)]$) when measured on the subgraph $\Gamma_i$, we find important relations, e.g. 

$$[N]_i = k_i, \qquad [D]_i = c_i, \qquad [k_j]_i = m(i,j). \qquad (16)$$



In other words, the number of vertices and link density of $\Gamma_i$ coincide with the degree and clustering coefficient of vertex $i$ measured on the whole network respectively. Similarly, the degree of vertex $j$ in $\Gamma_i$ coincides with the multiplicity $m(i,j)$ in the whole network. Since there are $k_i^{(m)}$ vertices in $\Gamma_i$ whose degree $[k_j]_i$ equals $m$, $k_i^{(m)}$ is the unnormalized degree distribution of $\Gamma_i$, and the associated degree sequence $\{[k_j]_i\}_j = \{m(i,j)\}_j$ must therefore be graphic. This observation enforces the required constraints along the rows of $Q$ (and also shows that our model, by specifying the entire degree distribution of $\Gamma_i$, is a sort of configuration model for each graph $\Gamma_i$; by contrast, models that specify the clustering coefficient $c_i$ alone are a sort of Erdős-Rényi random graph reproducing only the link density of $\Gamma_i$). Taking the two conditions together, we find that a necessary condition for a generalized degree sequence to be graphic is that, for fixed $m$, $\{k_i^{(m)}\}_i$ is a graphic degree sequence and, for fixed $i$, $\{k_i^{(m)}\}_m$ is a graphic degree distribution. This Sudoku-like condition operates along each row and column of $Q$ simultaneously. 

Fig. 3: On the left side, a generic network with $N = 6$ vertices is shown, and the edges attached to vertex $i$ are highlighted (cyan dashed edges). The $(N-1)$-dimensional generalized degree of vertex $i$ is in this case $\mathbf{k}_i = (1, 1, 2, 1, 0)$ (since the multiplicities of the dashed edges are $m_{ia} = 2$, $m_{ib} = 2$, $m_{ic} = 3$, $m_{id} = 1$, $m_{ie} = 0$, and there is no edge with maximum multiplicity $N - 2 = 4$). On the right side, the $i$-associated subgraph $\Gamma_i$ is shown. The degree of each vertex $j$ in $\Gamma_i$ coincides with the multiplicity $m(i,j)$ of the edge connecting $j$ to $i$ in the original graph on the left. 
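The two necessary conditions can be tested with the Erdős-Gallai criterion applied along the columns and rows of the generalized degree sequence. The Python sketch below is one possible way to do so, not a prescribed algorithm: the function names are hypothetical, the conditions checked are necessary but not claimed to be sufficient, and each row is converted into the degree sequence of the neighborhood subgraph by listing the value $m$ once for each of the $k_i^{(m)}$ neighbors.

```python
def is_graphic(seq):
    """Erdos-Gallai test: can seq be the degree sequence of a simple graph?"""
    d = sorted(seq, reverse=True)
    if any(x < 0 for x in d) or sum(d) % 2:
        return False
    for r in range(1, len(d) + 1):
        lhs = sum(d[:r])
        rhs = r * (r - 1) + sum(min(x, r) for x in d[r:])
        if lhs > rhs:
            return False
    return True


def passes_necessary_conditions(gen_degrees):
    """Check the column and row conditions on a generalized degree sequence."""
    n_mult = len(gen_degrees[0])
    # Columns: for fixed m, the sequence {k_i^(m)}_i must be graphic.
    for m in range(n_mult):
        if not is_graphic([kv[m] for kv in gen_degrees]):
            return False
    # Rows: for fixed i, the neighbourhood subgraph has k_i^(m) vertices of degree m.
    for kv in gen_degrees:
        neighbourhood = [m for m, count in enumerate(kv) for _ in range(count)]
        if not is_graphic(neighbourhood):
            return False
    return True


print(passes_necessary_conditions([[0, 0, 3]] * 4))  # complete graph on four vertices
print(passes_necessary_conditions([[0, 1], [0, 1]]))  # a single edge cannot close a triangle
```

In the second example the column condition is satisfied, but the row condition fails because a neighborhood subgraph with a single vertex of degree 1 cannot exist.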

Conclusions. — In this paper we have shown that real 
networks display broad, and often scale-free, edge multi- 
plicity distributions. Existing models cannot reproduce 
such a feature and are therefore inadequate to predict various properties of real networks. We have therefore introduced a model for networks with arbitrary generalized 
degree sequences. Unlike previous approaches, our model 
can take as input detailed information about the observed 
multiplicity structure to give refined analytical predictions 
about various network properties. We have finally ex- 
ploited a useful mapping to give necessary conditions for 
a generalized degree sequence to be graphic. 